A key requirement for scalable quantum computing is that elementary quantum gates can
be implemented with sufficiently low error. One method for determining the error behavior
of a gate implementation is to perform process tomography. However, standard process
tomography is limited by errors in state preparation, measurement and one-qubit gates. It
suffers from inefficient scaling with number of qubits and does not detect adverse
error-compounding when gates are composed in long sequences. An additional problem is due
to the fact that desirable error probabilities for scalable quantum computing are of the order
of 0.0001 or lower. Experimentally proving such low errors is challenging. We describe
a randomized benchmarking method that yields estimates of the computationally relevant
errors without relying on accurate state preparation and measurement. Since it involves long
sequences of randomly chosen gates, it also verifies that error behavior is stable when used
in long computations. We implemented randomized benchmarking on trapped atomic ion
qubits, establishing a one-qubit error probability per randomized $\pi/2$ pulse of 0.00482(17)
in a particular experiment. We expect this error probability to be readily improved with
straightforward technical modifications.

I. INTRODUCTION

In principle, quantum computing can be used to solve computational problems having no
known efficient classical solutions, such as factoring and quantum physics simulations, and to
significantly speed up unstructured searches and Monte-Carlo simulations [1, 2, 3, 4]. In order to
realize these advantages of quantum computing, we need to coherently control large numbers of
qubits for many computational steps. The smallest useful instances of the above-mentioned
algorithmic applications require hundreds of qubits and many millions of steps. A quantum computing
technology that realistically can be used to implement sufficiently large quantum computations is
said to be "scalable". Current quantum computing technologies that promise to be scalable have
demonstrated preparation of nontrivial quantum states of up to 8 qubits [5], but it is not yet possible
to apply more than a few sequential two-qubit gates without excessive loss of coherence. Although
there have been experiments to determine the behavior of isolated gates applied to prepared initial
states [6, 7, 8, 9, 10, 11, 12, 13, 14, 15], there have been no experiments to determine the noise
affecting gates in a general computational context.

An important challenge of quantum computing experiments is to physically realize gates that 
have low error whenever and wherever they are applied. Studies of fault-tolerant quantum computing
suggest that in order to avoid excessive resource overheads, the probability of error per
unitary gate should be well below $10^{-2}$ [16, 17, 18]. The current consensus is that it is a good idea
to aim for error probabilities below $10^{-4}$. What experiments can be used to verify such low error
probabilities? One approach is to use process tomography to establish the complete behavior of a 
quantum gate. This requires that the one-qubit gates employed in the tomography have lower error 
than the bound to be established on the gate under investigation. If this requirement is met, process 
tomography gives much useful information about the behavior of the gate, but fails to establish that 
the gate will work equally well in every context where it may be required. Process tomography 
can also be very time consuming as its complexity scales exponentially with the number of qubits. 

We propose a randomized benchmarking method to determine the error probability per gate in 
computational contexts. Randomization has been suggested as a tool for characterizing features 
of quantum noise in [19]. The authors propose implementing random unitary operators $U$ followed
by their inverses $U^{-1}$. Under the assumption that the noise model can be represented by a
quantum operation acting independently between the implementations of $U$ and $U^{-1}$, the effect of
the randomization is to depolarize the noise. The average fidelity of the process applied to a pure
initial state is the same as the average over pure states of the fidelity of the noise operation. (The
latter average is known as the average fidelity and is closely related to the entanglement fidelity of
an operation [20].) They also show that the average fidelity can be obtained with few random
experiments. They then consider self-inverting sequences of random unitary operations of arbitrary
length. Assuming that the noise can be represented by quantum operations that do not depend 
on the choice of unitaries, the fidelity-decay of the sequence is shown to represent the strength of 
the noise. Our randomized benchmarking procedure simplifies this procedure by restricting the 
unitaries to Clifford gates and by not requiring that the sequence is strictly self-inverting. An al- 
ternative approach to verifying that sequences of gates realize the desired quantum computation is 
given in [21]. In this approach, successively larger parts of quantum networks are verified by mak- 
ing measurements involving their action on entangled states. This "self testing" strategy is very 
powerful and provably works under minimal assumptions on gate noise. It is theoretically efficient 
but requires significantly more resources and multisystem control than randomized benchmarking. 

Our randomized benchmarking method involves applying random sequences of gates of vary- 
ing lengths to a standard initial state. Each sequence ends with a randomized measurement that 
determines whether the correct final state was obtained. The average computationally relevant 
error per gate is obtained from the increase in error probability of the final measurements as a 
function of sequence length. The random gates are taken from the Clifford group [22], which 
is generated by $\pi/2$ rotations of the form $e^{-i\sigma\pi/4}$ with $\sigma$ a product of Pauli operators acting on
different qubits. The restriction to the Clifford group ensures that the measurements can be of 
one-qubit Pauli operators that yield at least one deterministic one-bit answer in the absence of 
errors. The restriction is justified by the fact that typical fault-tolerant architectures (those based 
on stabilizer codes) are most sensitive to errors in elementary Clifford gates such as the controlled 
NOT. Provided the errors in these gates are tolerated, other gates needed for universality are readily
implemented [18, 23]. Note that the results of [19] hold if the unitaries are restricted to the
Clifford group, because the Clifford group already has the property that noise is depolarized. We 
believe that randomized benchmarking yields computationally relevant errors even when the noise 
is induced by, and depends on, the gates, as is the case in practice. 

Randomized benchmarking as discussed and implemented here gives an overall average fidelity 
for the noise in gates. To obtain more specific information, the technique needs to be refined. 
In [24], randomization by error-free one-qubit unitaries is used to obtain more detailed information
about noise acting on a multiqubit system. Randomized benchmarking can be adapted to use
similar strategies. 

II. RANDOMIZED BENCHMARK OF ONE QUBIT 

For one qubit, our randomized benchmarking procedure consists of a large number of experiments,
where each experiment consists of a pulse sequence that requires preparing an initial quantum
state $\rho$, applying an alternating sequence of either major-axis $\pi$ pulses or identity operators
("Pauli randomization") and $\pi/2$ pulses ("computational gates"), and performing a final
measurement $M$. The pulse sequence between state preparation and measurement begins and ends with $\pi$
pulses. For one qubit, the initial state is $|0\rangle$. Because the major-axis $\pi$ and $\pi/2$ rotations are in
the Clifford group, the state is always an eigenstate of a Pauli operator during the pulse sequence.
The Pauli randomization applies unitary operators ("Pauli pulses") that are (ideally) of the form
$e^{\pm i\sigma_b\pi/2}$, where the sign $\pm$ and $b = 0, x, y, z$ are chosen uniformly at random and we define $\sigma_0$ to
be the identity operator. For ideal pulses, the choice of sign determines only a global phase.
However, in an implementation, the choice of sign can determine a physical setting that may affect the
error behavior. The computational gates are $\pi/2$ pulses of the form $e^{\pm i\sigma_u\pi/4}$, with $u = x, y$. The
sign and $u$ are chosen uniformly at random, except for the last $\pi/2$ pulse, where $u$ is chosen so that
the final state is an eigenstate of $\sigma_z$. The computational gates generate the Clifford group for one
qubit. Their choice is motivated by the fact that they are experimentally implementable as simple
pulses. The final measurement is a von Neumann measurement of $\sigma_z$. The last $\pi/2$ pulse ensures
that, in the absence of errors, the measurement has a known, deterministic outcome for a given
pulse sequence. However, the randomization of the pulse sequence ensures that the outcome is not
correlated with any individual pulse or proper subsequence of pulses.
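As an illustration of the pulse definitions above, the following minimal Python sketch builds the Pauli pulses $e^{\pm i\sigma_b\pi/2}$ and the $\pi/2$ pulses $e^{\pm i\sigma_u\pi/4}$ as 2x2 matrices and checks numerically that an ideal random sequence keeps the state in a Pauli eigenstate. The function names (`pulse`, `apply`, `expectation`, `is_pauli_eigenstate`, `random_pulses`) are hypothetical and chosen only for this sketch. It does not select the final $\pi/2$ pulse to give a $\sigma_z$ eigenstate.

```python
import math
import random

I2 = ((1, 0), (0, 1))
SIGMA = {"0": I2, "x": ((0, 1), (1, 0)), "y": ((0, -1j), (1j, 0)), "z": ((1, 0), (0, -1))}

def pulse(axis, angle, sign):
    """exp(sign*i*sigma*angle) = cos(angle) I + sign*i*sin(angle) sigma, since sigma^2 = I."""
    c, s = math.cos(angle), sign * 1j * math.sin(angle)
    m = SIGMA[axis]
    return tuple(tuple(c * I2[r][k] + s * m[r][k] for k in range(2)) for r in range(2))

def apply(u, psi):
    """Apply a 2x2 matrix to a state vector (psi0, psi1)."""
    return (u[0][0] * psi[0] + u[0][1] * psi[1], u[1][0] * psi[0] + u[1][1] * psi[1])

def expectation(axis, psi):
    """Expectation value <psi|sigma_axis|psi> for a normalized state."""
    phi = apply(SIGMA[axis], psi)
    return (psi[0].conjugate() * phi[0] + psi[1].conjugate() * phi[1]).real

def is_pauli_eigenstate(psi, tol=1e-9):
    """True if psi is a +1 or -1 eigenstate of sigma_x, sigma_y or sigma_z."""
    return any(abs(abs(expectation(u, psi)) - 1) < tol for u in "xyz")

def random_pulses(n_gates, rng):
    """Pauli pulse, pi/2 pulse, ..., Pauli pulse: begins and ends with a Pauli pulse."""
    seq = []
    for _ in range(n_gates):
        seq.append(pulse(rng.choice("0xyz"), math.pi / 2, rng.choice((1, -1))))
        seq.append(pulse(rng.choice("xy"), math.pi / 4, rng.choice((1, -1))))
    seq.append(pulse(rng.choice("0xyz"), math.pi / 2, rng.choice((1, -1))))
    return seq
```

Applying the matrices returned by `random_pulses` one after another to `(1 + 0j, 0j)` and calling `is_pauli_eigenstate` after each step should return `True` every time. Comparing `pulse("x", math.pi / 2, 1)` with `pulse("x", math.pi / 2, -1)` shows that the two signs differ only by a global factor of $-1$.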

The length $l$ of a randomized pulse sequence is its number of $\pi/2$ pulses. The $\pi/2$ pulses are
considered to be the ones that advance a computation. The $\pi$ pulses serve only to randomize the
errors. One can view their effect as being no more than a change of the Pauli frame. The Pauli frame
consists of the Pauli operator that needs to be applied to obtain the intended computational state in
the standard basis [23]. We call the $\pi/2$ and Pauli pulse combinations randomized computational
gates. In principle, we can determine a pulse error rate by performing $N$ experiments for each
length $l = 1, \ldots, L$ to estimate the average probability $p_l$ of the incorrect measurement outcome
(or "error probability") for sequences of length $l$. The relationship between $l$ and $p_l$ can be used
to obtain an average probability of error per pulse. Suppose that all errors are independent and
depolarizing. Let the depolarization probability of an operation $A$ be $d_A$ and consider a specific
pulse sequence consisting of operations $A_0, A_1A_2, \ldots, A_{2l+1}A_{2l+2}, A_{2l+3}$, where $A_0$ is the state
preparation, $A_1A_2$ and the following pairs are the randomized computational gates, and $A_{2l+3}$ the
measurement. For the measurement, we can assume that the error immediately precedes a perfect
measurement. The state after $A_k$ is a known eigenstate of a Pauli operator or completely
depolarized. Depolarization of the state is equivalent to applying a random Pauli or identity operator, each
with probability 1/4. The probability of the state's not having been depolarized is $\prod_{j=0}^{k}(1 - d_{A_j})$.
In particular, we can express $p_l = E\big((1 - \prod_{j=0}^{2l+3}(1 - d_{A_j}))/2\big)$, where the function $E(\cdot)$ gives the
expectation over the random choices of the $A_j$. The factor of 1/2 in the expression for $p_l$ arises
because depolarization results in the correct state 1/2 of the time. The choices of the $A_j$ are
independent except for the last $\pi/2$ pulse. Assume that the depolarization probability of the last $\pi/2$
pulse does not depend on the previous pulses. We can then write $p_l = (1 - (1 - d_{if})(1 - d)^l)/2$,
where $d$ is the average depolarization probability of a random combination of one $\pi/2$ and one
Pauli pulse (a randomized computational gate), and $d_{if}$ combines the depolarization probabilities
of the preparation, initial Pauli pulse and measurement. Thus $p_l$ approaches 1/2 exponentially, and
the decay constant yields $d$.
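The decay model $p_l = (1 - (1 - d_{if})(1 - d)^l)/2$ can be sketched in a few lines of Python. The example below evaluates the model and shows how $d$ follows from the error probabilities at two lengths, because the ratio of $1 - 2p_l$ values cancels $d_{if}$. The function names are hypothetical, and any numerical value chosen for $d_{if}$ is an assumption made only for illustration.

```python
def error_probability(l, d, d_if):
    """Model p_l = (1 - (1 - d_if) * (1 - d)**l) / 2 for independent depolarizing errors."""
    return (1 - (1 - d_if) * (1 - d) ** l) / 2

def depolarization_from_two_lengths(l1, p1, l2, p2):
    """Recover d from two points of the decay.

    Since 1 - 2*p_l = (1 - d_if) * (1 - d)**l, the ratio of two such values
    equals (1 - d)**(l2 - l1) and no longer depends on d_if.
    """
    ratio = (1 - 2 * p2) / (1 - 2 * p1)
    return 1 - ratio ** (1 / (l2 - l1))
```

For instance, with a hypothetical `d_if = 0.01` and `d = 0.00964`, calling `depolarization_from_two_lengths` on the model values at lengths 8 and 96 returns the input `d` up to rounding. With measured data, a fit over all lengths is preferable to using two points.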

A commonly used metric to describe the deviation of an implemented gate from the intended
gate is the average fidelity $F_a$, which is defined as the uniform average over pure input states of
the fidelity of the output state with respect to the intended output state. We are interested in the
average computationally relevant error per step consisting of a randomized computational gate
("average error" for short). This is given by the expectation over gates of $1 - F_a$ and relates to the
depolarization parameter $d$ of the previous paragraph by $1 - F_a = d/2$. In our implementation of
the randomized computational gates, the $\pi$ pulses around the $z$-axis are implemented by changes
in rotating frame and do not involve actively applying a pulse. Therefore, on average, the angular
distance of the randomized gate's action is $\pi$. As a result, $(1 - d/2)$ represents the average fidelity
of pulses with action $\pi$.
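The relation $1 - F_a = d/2$ can be checked numerically. The sketch below models depolarization with probability $d$ as applying the identity or a Pauli operator, each with probability 1/4, and computes the fidelity of the output with a pure input state. The helper names are hypothetical, and the ideal gate is taken to be the identity, which is an assumption that does not affect the fidelity of the noise itself.

```python
import math
import random

PAULIS = [((1, 0), (0, 1)), ((0, 1), (1, 0)), ((0, -1j), (1j, 0)), ((1, 0), (0, -1))]

def random_pure_state(rng):
    """Normalized vector of Gaussian amplitudes: uniform over pure qubit states."""
    a = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
    norm = math.sqrt(sum(abs(x) ** 2 for x in a))
    return [x / norm for x in a]

def overlap_sq(psi, phi):
    """|<psi|phi>|^2 for two state vectors."""
    return abs(psi[0].conjugate() * phi[0] + psi[1].conjugate() * phi[1]) ** 2

def depolarized_fidelity(psi, d):
    """Fidelity of psi after depolarization with probability d."""
    fid = 1 - d  # with probability 1 - d the state is left unchanged
    for s in PAULIS:  # otherwise identity, X, Y or Z, each with probability 1/4
        phi = [s[0][0] * psi[0] + s[0][1] * psi[1], s[1][0] * psi[0] + s[1][1] * psi[1]]
        fid += (d / 4) * overlap_sq(psi, phi)
    return fid
```

For any pure state, the squared expectation values of the three Pauli operators sum to 1, so the fidelity equals $1 - d/2$ independently of the input state, and averaging over states gives $F_a = 1 - d/2$.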

Although estimates of $p_l$ are sufficient to obtain the average error for a randomized
computational gate, it is useful to consider the error behavior of specific randomized computations and even
fixed instances of the randomized sequences. For this purpose, the sequences are generated by first
producing $N_G$ random sequences consisting of $L$ random computational gates, where the gates are
chosen independently without considering the final state. These sequences are considered to be
a sample of typical computations. Each sequence is then truncated at different lengths. For each
length, a $\pi/2$ pulse is appended to ensure that the final state is an eigenstate of $\sigma_z$. The sign of
this final pulse is random. The resulting sequences are randomized by inserting the random Pauli
pulses. We can then perform experiments to determine the probability of incorrect measurement
outcomes for each such sequence and for each truncated computation after randomization by Pauli
pulses. To be specific, the procedure is implemented as follows:

Randomized benchmarking for one qubit: This obtains measurement statistics for $N_G N_l N_P N_e$
experiments, where $N_G$ is the number of different computational gate sequences, $N_l$ is the
number of lengths to which the sequences are truncated, $N_P$ is the number of Pauli
randomizations for each gate sequence, and $N_e$ is the number of experiments for each specific
sequence.

1. Pick a set of lengths $l_1 < l_2 < \cdots < l_{N_l}$. The goal is to determine the probability of error of
randomized computations of each length.

2. Do the following for each $j = 1, \ldots, N_G$:

2.a. Choose a random sequence $\mathcal{G} = \{G_1, \ldots\}$ of $l_{N_l} - 1$ computational gates.

2.b. For each $k = 1, \ldots, N_l$ do the following:

2.b.1. Determine the final state $\rho_f$ obtained by applying $G_{l_k} \cdots G_1$ to $|0\rangle$, assuming no
error.

2.b.2. Randomly pick a final computational gate $R$ among the two $\pm x$, $\pm y$, $\pm z$ axis
$\pi/2$ pulses that result in an eigenstate of $\sigma_z$ when applied to $\rho_f$. Record which
eigenstate is obtained.

2.b.3. Do the following for each $m = 1, \ldots, N_P$:

2.b.3.a. Choose a random sequence $\mathcal{P} = \{P_1, \ldots\}$ of $l_k + 2$ Pauli pulses.

2.b.3.b. Experimentally implement the pulse sequence that applies
$P_{l_k+2} R P_{l_k+1} G_{l_k} \cdots G_1 P_1$ to $|0\rangle$ and measures $\sigma_z$, repeating the
experiment $N_e$ times.

2.b.3.c. From the experimental data and the expected outcome of the experiments
in the absence of errors (from step 2.b.2), obtain an estimate $p_{j,l_k,m}$ of the
probability of error. Record the uncertainty of this estimate.
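The sequence-generation part of the procedure (steps 2.a to 2.b.3.a) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the pulse-programming code used in the experiment. All function names are hypothetical. The sketch counts the final pulse $R$ in the length, so a sequence of length $l$ has $l - 1$ random gates plus $R$ and $l + 1$ Pauli pulses. It also computes the expected measurement bit by an ideal simulation of the full sequence, which accounts for the Pauli pulses.

```python
import math
import random

I2 = ((1, 0), (0, 1))
SIGMA = {"0": I2, "x": ((0, 1), (1, 0)), "y": ((0, -1j), (1j, 0)), "z": ((1, 0), (0, -1))}

def pulse(axis, angle, sign):
    """Matrix of exp(sign * i * sigma_axis * angle), using sigma^2 = I."""
    c, s = math.cos(angle), sign * 1j * math.sin(angle)
    m = SIGMA[axis]
    return tuple(tuple(c * I2[r][k] + s * m[r][k] for k in range(2)) for r in range(2))

def run(pulses, psi=(1 + 0j, 0j)):
    """Ideal (error-free) evolution of psi under a list of (axis, angle, sign) pulses."""
    for axis, angle, sign in pulses:
        u = pulse(axis, angle, sign)
        psi = (u[0][0] * psi[0] + u[0][1] * psi[1], u[1][0] * psi[0] + u[1][1] * psi[1])
    return psi

def prob_one(psi):
    """Probability of measuring |1>."""
    return abs(psi[1]) ** 2

def final_gate(psi, rng):
    """Step 2.b.2: pick R among the pi/2 pulses that map psi to a sigma_z eigenstate."""
    options = []
    for axis in "xyz":
        for sign in (1, -1):
            p1 = prob_one(run([(axis, math.pi / 4, sign)], psi))
            if min(p1, 1 - p1) < 1e-9:
                options.append((axis, math.pi / 4, sign))
    return rng.choice(options)

def benchmark_sequences(lengths, n_pauli, rng):
    """Yield (length, pulse list, expected bit) for one random gate sequence."""
    # Step 2.a: one random computational sequence, truncated below.
    gates = [(rng.choice("xy"), math.pi / 4, rng.choice((1, -1)))
             for _ in range(max(lengths) - 1)]
    for l in lengths:
        prefix = gates[: l - 1]
        r = final_gate(run(prefix), rng)  # steps 2.b.1 and 2.b.2
        for _ in range(n_pauli):  # step 2.b.3
            paulis = [(rng.choice("0xyz"), math.pi / 2, rng.choice((1, -1)))
                      for _ in range(l + 1)]
            pulses = [paulis[0]]
            for gate, p in zip(prefix + [r], paulis[1:]):
                pulses += [gate, p]  # Pauli pulses interleave the pi/2 pulses
            expected_bit = round(prob_one(run(pulses)))
            yield l, pulses, expected_bit
```

Each yielded pulse list has $2l + 1$ entries and, in the absence of errors, ends in a $\sigma_z$ eigenstate, so `expected_bit` is the deterministic outcome against which measured data would be compared in step 2.b.3.c.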

The probabilities of error $p_l$ are obtained from the $p_{j,l_k,m}$ by averaging
$p_{l_k} = \sum_{j=1}^{N_G}\sum_{m=1}^{N_P} p_{j,l_k,m}/(N_G N_P)$. We also obtain the probabilities of error for each computational
gate sequence, $p_{j,l_k} = \sum_{m=1}^{N_P} p_{j,l_k,m}/N_P$. If the errors are independent and depolarizing, the $p_{j,l_k,m}$
and the $p_{j,l_k}$ should not differ significantly from the $p_{l_k}$. However, if the errors are systematic in
the sense that each implemented pulse differs from the ideal pulse by a pulse-dependent unitary
operator, this can be observed in the distribution of the $p_{j,l_k,m}$ over $m$. In this case, the final state
of each implemented pulse sequence is pure. The deviation of these pure states from the expected
states is distributed over the Bloch sphere as $m$ and $j$ are varied. For example, consider the case
where $p_{l_k}$ is close to 1/2. If the errors are systematic, the $p_{j,l_k,m}$ are distributed as the probability
amplitude of $|1\rangle$ for a random pure state. In particular, we are likely to find many instances of $j$
and $m$ where $p_{j,l_k,m}$ is close to 0 or 1, that is, differs significantly from 1/2. In contrast, if the error
is depolarizing, the $p_{j,l_k,m}$ are all close to 1/2 independent of $j$ and $m$.
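The contrast between systematic and depolarizing errors in the limit where the average error probability is near 1/2 can be illustrated with a small Python sketch. It samples random pure qubit states and computes the probability of $|1\rangle$ for each, which stands in for the per-sequence error probabilities in the systematic case. The function names and the margin of 0.4 are hypothetical choices for this illustration.

```python
import random

def random_pure_state_prob_one(rng):
    """Probability of |1> for a uniformly random pure qubit state.

    The state is sampled as a normalized vector of complex Gaussian amplitudes.
    """
    a0 = complex(rng.gauss(0, 1), rng.gauss(0, 1))
    a1 = complex(rng.gauss(0, 1), rng.gauss(0, 1))
    return abs(a1) ** 2 / (abs(a0) ** 2 + abs(a1) ** 2)

def fraction_far_from_half(probs, margin=0.4):
    """Fraction of probabilities that differ from 1/2 by more than the margin."""
    return sum(1 for p in probs if abs(p - 0.5) > margin) / len(probs)
```

For uniformly random pure states the probability of $|1\rangle$ is spread evenly over the interval from 0 to 1, so a sizeable fraction of the samples lies near 0 or 1. A fully depolarized state gives exactly 1/2 for every sequence, so the same statistic is zero.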



III. TRAPPED-ION-QUBIT IMPLEMENTATION 

We determined the computationally relevant error probabilities for computational gates on one
qubit in an ion trap. The qubit was represented by two ground-state hyperfine levels of a $^9$Be$^+$
ion trapped in a linear radio-frequency Paul trap briefly described in [25]. It is the same trap that
has been used in several quantum information processing experiments [26, 27, 28, 29, 30]. The
two qubit states are $|\downarrow\rangle$ ($F = 2$, $m_F = -2$) and $|\uparrow\rangle$ ($F = 1$, $m_F = -1$), where for our purposes,
we identify $|\downarrow\rangle$ with $|0\rangle$ and $|\uparrow\rangle$ with $|1\rangle$. The state $|\downarrow\rangle$ is prepared by optical pumping, after
laser cooling the motional states of the ion. We can distinguish between $|\downarrow\rangle$ and $|\uparrow\rangle$ by means of
state-dependent laser fluorescence. Computational gates and Pauli pulses involving $x$- or $y$-axis
rotations were implemented by means of two-photon stimulated Raman transitions. To ensure that
the pulses were not sensitive to the remaining excitations of the motional degrees of freedom, we
used copropagating Raman beams. It was therefore not necessary to cool to the motional ground
state and only Doppler cooling was used. Pulses involving $z$-axis rotations were implemented by
programmed phase changes of one of the Raman beams. This changes the phase of the rotating
reference frame and is equivalent to the desired $z$-axis rotation. The $z$-axis rotations were
accompanied by a delay equivalent to the corresponding $x$ and $y$ pulses.

The Raman beams were switched on and off and shifted in phase and frequency as necessary 
by means of acousto-optic modulators controlled by a field-programmable gate array (FPGA). The 
pulse sequences were written in a special-purpose pulse-programming language and precompiled 
onto the FPGA. The version of the FPGA in use for the experiments was limited to about 100 com- 
putational pulses. The longest sequence in our experiments consisted of 96 computational gates. 
Our initial implementations clearly showed the effects of systematic errors in the distribution of 
the error probabilities of individual sequences. This proved to be a useful diagnostic and we were 
able to correct these systematics to some extent. One of the largest contributions to systematic 
errors was due to Stark shifts. To correct for these shifts, we calibrated them and adjusted
phases in the pulse sequences. 
IV. EXPERIMENTAL RESULTS

We generated $N_G = 4$ random computational sequences and truncated them to the $N_l = 17$
lengths {2, 3, 4, 5, 6, 8, 10, 12, 16, 20, 24, 32, 40, 48, 64, 80, 96}. Each truncated sequence was
Pauli randomized $N_P = 8$ times. Each final pulse sequence was applied to an ion a total of 8160
times in four groups that were interleaved with the other experiments in a randomized order. Pulse
durations, qubit-resonant frequencies and Stark shifts were recalibrated automatically at regular
intervals. The number of experiments per pulse sequence was sufficient to obtain the probability
of incorrect measurement outcome with a statistical error small compared to the variation due
to randomization and systematic errors. Fig. 1 plots the fidelity (one minus the probability of
incorrect measurement outcome) of each of the $4 \times 17 \times 8 = 544$ final pulse sequences against
the length of the corresponding computational sequence. As explained in the figure caption, the
variation in fidelity for each length shows that non-depolarizing errors contribute significantly to
error. Fig. 2 plots the average fidelity over the eight Pauli randomizations of each computational
sequence truncated to the different lengths. Pauli randomization removes coherent errors,
significantly reducing the variation in fidelities for different computational sequences. The remaining
variation could be due to the small sample of 8 Pauli randomizations used to obtain the average.
The empirical average probability of error per randomized computational gate can be obtained by
fitting the exponential decay and was found to be 0.00482(17). The fit was consistent with a simple
exponential decay, which suggests that these gates behave similarly in all computational contexts.



The error bars represent standard deviation as determined by nonparametric bootstrapping [31].
In what follows, if the fits are good, error bars are determined from nonlinear least-squares fits. In
the cases where we can obtain a useful estimate of an error per randomized computational gate but
the fits are poor, we used nonparametric bootstrapping.
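A fit of the exponential decay with a nonparametric bootstrap error bar can be sketched as follows. This is a simplified illustration and not the fitting code used for the reported value: it replaces the nonlinear least-squares fit by a log-linear least-squares fit of $2F - 1 = (1 - d_{if})(1 - d)^l$, where $F$ is the fidelity, and it resamples the data points with replacement. The function names are hypothetical, and any synthetic data fed to it (for example a chosen $d_{if}$ or scatter) are assumptions.

```python
import math
import random
import statistics

def fit_error_per_gate(points):
    """Log-linear least-squares fit; points are (length, fidelity) pairs with fidelity > 1/2.

    Since log(2F - 1) = log(1 - d_if) + l * log(1 - d), the slope gives d.
    Returns the error per randomized computational gate, d / 2.
    """
    xs = [l for l, _ in points]
    ys = [math.log(2 * f - 1) for _, f in points]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return (1 - math.exp(slope)) / 2

def bootstrap_std(points, n_resamples, rng):
    """Standard deviation of the fitted error over resamples drawn with replacement."""
    estimates = []
    for _ in range(n_resamples):
        sample = [rng.choice(points) for _ in points]
        if len({l for l, _ in sample}) < 2:
            continue  # a resample with a single length cannot determine a slope
        estimates.append(fit_error_per_gate(sample))
    return statistics.stdev(estimates)
```

A typical use is to collect the (length, fidelity) pairs of all measured sequences, call `fit_error_per_gate` once for the point estimate, and call `bootstrap_std` with a few hundred resamples for the quoted uncertainty. The log-linear fit weights the points differently from a nonlinear fit, so the two need not agree exactly on real data.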

For our experimental setting, it is possible to perform experiments to quantify the different 
types of errors as a consistency check. The results of these experiments are in App. A and are
consistent with the randomized benchmarking data. 



V. THEORETICAL CONSIDERATIONS 

The average error per randomized computational gate is obtained by fitting an exponential. For 
general error models, it is possible that the initial behavior of the measured error probabilities does 
not represent the average error of interest, and it is the eventual decay behavior that is of interest. In 
this case, randomized benchmarking determines an asymptotic average error probability (AAEP) 
per randomized computational gate. It is desirable to relate the empirical AAEP to the average 
error probability (AEP) of a single randomized computational gate. As discussed above, the AAEP
agrees with the AEP if the error of all operations is depolarizing and independent of the gates. It
can be seen that for depolarizing errors, this relationship holds even if the error depends on the 
gates. In general, one can consider error models with the following properties: 

Memoryless errors. The errors of each gate are described by a quantum operation. In 
particular, the "environment" for errors in one gate is independent of that in another. 

Independent errors. For gates acting in parallel on disjoint qubits, each gate's errors are 
described by a quantum operation acting on only that gate's qubits. 



Stationary errors. The errors depend only on the gate, not on where and when in the 
process the error occurs. 



Subsystem preserving errors. The errors cause no leakage out of the subsystem defining
the qubits.

Although the AAEP need not be identical to the AEP, we conjecture that there are useful bounds
relating the two error probabilities. In particular, if the AAEP is zero then there is a fixed logical
frame in which the AEP is zero. Trivially, if the AEP is zero, then the AAEP is zero.

Randomized benchmarking involves both Pauli randomization and computational gate
randomization. The expected effect of Pauli randomization is to ensure that, to first order, errors consist of
random (but not necessarily uniformly random) Pauli operators. Computational gate
randomization ensures that we average errors over the Clifford group. If, as in our experimental
implementation, the computational gates generate only the Clifford group, it takes a few steps for the effect to
be close to averaging over the Clifford group. This process is expected to have the effect of making
all errors equally visible to our measurement, even though the measurement is fixed in the logical
basis and the last step of the randomized computation is picked so that the answer is deterministic
in the absence of errors.

[Plot: fidelity (logarithmic scale) versus number of computational gates.]

FIG. 1: Fidelity as a function of the number of steps for each randomized sequence. The fidelity
(1 - prob. of error) is plotted on a logarithmic scale. The fidelity for the final state is measured for each
randomized sequence. There are 32 points for each number of steps, corresponding to 8 randomizations of
each of four different computational sequences. Different symbols are used for the data for each
computational sequence. The standard error of each point is between 0.001 (near fidelities of 1) and 0.006 (for the
smaller fidelities). The scatter greatly exceeds the standard error, suggesting that coherent errors contribute
significantly to the loss of fidelity.



FIG. 2: Average fidelity as a function of the number of steps for each computational sequence. The points
show the average randomized fidelity for four different computational gate sequences (indicated by the
different symbols) as a function of the length. The average fidelity is plotted on a logarithmic scale. The
middle line shows the fitted exponential decay. The upper and lower lines show the boundaries of the 68%
confidence interval for the fit. The standard deviation of each point due to measurement noise ranges from
0.0004 for values near 1 to 0.002 for the lower values, smaller than the size of the symbols. The empirical
standard deviation based on the scatter in the points shown in Fig. 1 ranges from 0.0011 to 0.014. The slope
implies an error probability of 0.00482(17) per randomized computational gate. The data is consistent with
the gate's errors not depending on position in the sequence.

VI. BENCHMARKING MULTIPLE QUBITS

Scalable quantum computing requires not only having access to many qubits, but also the ability
to apply many low-error quantum gates to these qubits. The error behavior of gates should not
become worse as the computation proceeds. Randomized benchmarking can verify the ability to
apply many multiqubit gates consistently.

Randomized benchmarking can be applied to two or more qubits by expanding the set of computational
gates to include multiqubit gates. The initial state is $|0 \ldots 0\rangle$. Pauli randomization is
performed as before and is expected to convert the error model to probabilistic Pauli errors to first 
order. Because the size of the Clifford group for two or more qubits is large, one cannot expect 
to effect a random Clifford group element at each step. Instead, one has to rely on rapid mixing 
of random products of generators of the Clifford group to achieve (approximate) multiqubit de- 
polarization. The number of computational steps that is required for approximate depolarization 
depends on the computational gate set. An example of a useful gate set consists of controlled 
NOTs (alternatively, controlled sign flips) combined with major-axis $\pi/2$ pulses on individual
qubits. By including sufficiently many one-qubit variants of each gate, one can ensure that each 
step's computational gates are randomized in the product of the one-qubit Clifford groups. This 
already helps: It has the effect of equalizing the probability of Pauli product errors of the same 
weight (see [24]).

The one-qubit randomized benchmark has a last step that ensures a deterministic answer for 
the measurement. For n > 1 qubits, one cannot expect deterministic answers for each qubit's 
measurement, as this may require too complex a Clifford transformation. Instead, one can choose 
a random Pauli product that stabilizes the last state and apply a random product of one-qubit $\pi/2$
pulses with the property that this Pauli product is turned into a product of $\sigma_z$ operators. If there is
no error, measuring $\sigma_z$ for each qubit and then computing the appropriate parity of the measurement
outcomes gives a known deterministic answer. With error, the probability of obtaining the
wrong parity can be thought of as a one-qubit error probability $p$ for the sequence. If the error
is completely depolarizing on all qubits, with depolarization probability $d$, then $p = d/2$, just as
for one qubit. One expects that for sufficiently long sequences, $p$ increases exponentially toward
1/2 so that the asymptotic average error probability per randomized computational gate can be
extracted as for one qubit. 
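The parity readout described above can be sketched in Python. The example assumes that each experiment returns one bit per qubit from the $\sigma_z$ measurements and that the stabilizing Pauli product has been turned into a product of $\sigma_z$ operators on a known subset of qubits. The function names, the `support` argument listing that subset, and the data layout are hypothetical.

```python
def parity(bits, support):
    """Parity (0 or 1) of the measured bits on the qubits listed in support."""
    return sum(bits[q] for q in support) % 2

def parity_error_probability(shots, support, expected_parity):
    """Fraction of shots whose parity differs from the error-free expectation."""
    wrong = sum(1 for bits in shots if parity(bits, support) != expected_parity)
    return wrong / len(shots)
```

The returned fraction plays the role of the one-qubit error probability $p$ for the sequence, and its dependence on sequence length can then be analyzed in the same way as for one qubit.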
APPENDIX A: DIRECT ERROR CHARACTERIZATIONS

We performed experiments to directly quantify the different types of errors in our pulses. These 
experiments characterize only the initial error (the error of the first gates) and serve as a consistency 
check for the randomized benchmarking data. 

Known sources of errors include (a) phase errors due to fluctuating magnetic fields and changes 
in path length between the two Raman beams (they are merged on a polarizing beamsplitter before 
targeting the ion), (b) amplitude errors due to changes in beam position at the ion and intensity
fluctuations not compensated by the "noise eaters" (active beam intensity stabilization), and (c) 
spontaneous emission from the upper levels required for the stimulated Raman transition. 

Phase decoherence can be measured by observing the decay of signal in a Ramsey spectrometry
experiment of the qubit with or without refocusing [32]. Fig. 3 shows the probability of observing
$|1\rangle$ at the end of a refocused Ramsey experiment as a function of the delay between the first and
last $\pi/2$ pulse. By fitting the initial part of the curve to an exponential decay, one can infer the
contribution of unrefocusable phase error to each step of the Pauli randomized sequences. We
obtained an estimate of 0.0037(1) for this contribution. Fig. 4 shows the probability of observing
$|1\rangle$ in a similar experiment but with the refocusing pulse omitted. This is an on-resonance Ramsey
experiment. The fit suggests a contribution of 0.0090(7) for the error per step. This is larger than
the inferred error from the randomized experiments, which can be explained by the refocusing
effects of the Pauli randomization. See the caption of Fig. 4 for a discussion of fitting issues. We
note that our benchmarking experiments, as well as the error characterizations in this section, were
performed without line-triggering the experiments, thereby making them sensitive to phase shifts
caused by 60 Hz magnetic field fluctuations. Greatly improved decoherence times are typically
obtained if such triggering is used.

The contribution of spontaneous emission to phase decoherence can be determined by a refo- 
cused Ramsey experiment where the two Raman beams are on separately half the time during the 
intervals between the pulses [32]. To determine the desired contribution, the probability of |1) as 
a function of time is compared to the data shown in Fig. 3. The results of the comparisons are in
Fig. 5. The inferred contribution to the error probability per step is 0.00038(3), well below the
contribution of the other sources of error. 

The effect of amplitude fluctuations can be estimated from the loss of visibility of a Rabi
flopping experiment. The data are shown in Fig. 6. Modeling the Rabi flopping curve is non-trivial
and the fits are not very good. Nevertheless, we can estimate a contribution to the error probability
per step from the behavior of the curve during the first few oscillations. This gives a contribution of
0.006(3), consistent with the probability of error per step obtained in the randomized experiments.
Note that the contribution measured here also includes errors due to phase fluctuations during the
computation pulses.

FIG. 3: Measurement of phase decoherence with refocusing. We measured the probability of $|1\rangle$ as a
function of time for the standard refocused decoherence measurement. The pulse sequence consisted of
a $\pi/2$ pulse at phase 0 followed by a delay of $T/2$, a $\pi$ pulse at phase $\pi$, another delay of $T/2$ and a
final $\pi/2$ pulse at phase $\pi$. The straight line shows the fit for exponential decay on the interval from 1 to
200 $\mu$s. Its extrapolation to larger times is shown dashed. The deviation from an exponential decay at larger
times can be attributed to slow phase drifts that are no longer refocused by the single $\pi$ pulse in the pulse
sequence. From the fit, the contribution of unrefocusable phase decoherence to the error probability per step
is 0.0037(1). The standard deviation of the plotted points ranges from 0.002 for values near 1 to 0.008 for
the smallest values, similar to the apparent scatter of the plotted points.
FIG. 4: Measurement of phase decoherence without refocusing. The randomized benchmark does not
systematically refocus changes in frequency. To estimate the contribution to error from decoherence including
FIG. 6: Rabi flopping experiment. To determine the contribution to the probability of error per step due
to pulse area error and associated decoherence, we performed a Rabi flopping experiment. We fitted the
points to a decaying cosine curve with a possible phase offset and both linear and quadratic decay. Again,
we restricted the fit to an initial segment of the data (black curve). The extrapolation (dashed curve) shows
significant deviations. The random uncertainty in the points ranges from 0.002 to 0.007, less than the
symbol size of the plotted points. The apparent scatter in the points near the end of the curve is likely due
to slow fluctuations in pulse amplitude. The contribution to the probability of error per step as detected in
this experiment is 0.006(3) if the calibration were based on this experiment. Automatically calibrated pulse
times fluctuated by around 0.02 $\mu$s. For pulse times differing by this amount, the contribution to the error
per step is 0.007(3).